Deqing Fu

I'm Deqing Fu, a fourth-year Ph.D. candidate in Computer Science at the University of Southern California (USC). My main research interests are deep learning theory, natural language processing, and the interpretability of AI systems. I'm co-advised by Prof. Vatsal Sharan of the USC Theory Group and Prof. Robin Jia of the Allegro Lab within the USC NLP Group, and I work closely with Prof. Mahdi Soltanolkotabi and Prof. Shang-Hua Teng. Before USC, I completed my undergraduate degree in Mathematics (with honors) and my master's in Statistics at the University of Chicago.

My research focuses on understanding large language models from algorithmic and theoretical perspectives, as well as on developing methods for multimodal learning and synthetic data generation. You can find my publications on Google Scholar and my recent CV here.

Algorithmic Perspectives on Large Language Models
  • Transformers and in-context learning: do they implement gradient descent? (NeurIPS 2024)
  • Arithmetic in pretrained LLMs: memorization vs. mechanisms? (NeurIPS 2024, arXiv 2025)
  • What distinguishes Transformers from other architectures? (ICLR 2025)
  • Decision theory for LLM reasoning under uncertainty (ICLR 2025 Spotlight)
Synthetic Data and Multimodal Learning
  • Modality sensitivity in Multimodal LLMs (COLM 2024)
  • VLM feedback for Text-to-Image generation (NAACL 2025)
  • Token-level reward models (TLDR) for reducing hallucinations (ICLR 2025)

News

Oct 22, 2025 New preprint: When Do Transformers Learn Heuristics for Graph Connectivity? arXiv.
Jul 23, 2025 Three new preprints (Multimodal Steering, Resa, and Zebra-CoT).
Jul 22, 2025 Check out our new paper Zebra-CoT on interleaved text and visual reasoning! Dataset and model are available on Hugging Face 🤗.
May 22, 2025 Talk at Stanford NLP Seminar. Slides here.
May 08, 2025 Talk at UChicago/TTIC NLP Seminar.

Selected Publications

See the full list or Google Scholar for all publications.

2025

  1. arXiv
    When Do Transformers Learn Heuristics for Graph Connectivity?
    Qilin Ye*, Deqing Fu*, Robin Jia, and Vatsal Sharan
    arXiv preprint, 2025
    *Equal Contribution
  2. arXiv
    Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning
    arXiv preprint, 2025
    *Equal Contribution
  3. arXiv
    Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models
    Woody Haosheng Gan*, Deqing Fu*, Julian Asilis*, Ollie Liu*, Dani Yogatama, Vatsal Sharan, Robin Jia, and Willie Neiswanger
    arXiv preprint, 2025
    *Equal Contribution
  4. ICLR
    TLDR: Token-Level Detective Reward Model for Large Vision Language Models
    Deqing Fu, Tong Xiao, Rui Wang, Wang Zhu, Pengchuan Zhang, Guan Pang, Robin Jia, and Lawrence Chen
    In International Conference on Learning Representations (ICLR), 2025
  5. ICLR
    Transformers Learn Low Sensitivity Functions: Investigations and Implications
    Bhavya Vasudeva*, Deqing Fu*, Tianyi Zhou, Elliot Kau, You-Qi Huang, and Vatsal Sharan
    In International Conference on Learning Representations (ICLR), 2025
    *Equal Contribution
  6. ICLR
    DeLLMa: Decision Making Under Uncertainty with Large Language Models
    Ollie Liu*, Deqing Fu*, Dani Yogatama, and Willie Neiswanger
    In International Conference on Learning Representations (ICLR), 2025
    Spotlight (Top 5.1%), *Equal Contribution
  7. NAACL
    DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback
    Jiao Sun*, Deqing Fu*, Yushi Hu*, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, and Cyrus Rashtchian
    In Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2025
    *Equal Contribution

2024

  1. NeurIPS
    Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression
    Deqing Fu, Tian-Qi Chen, Robin Jia, and Vatsal Sharan
    In Conference on Neural Information Processing Systems (NeurIPS), 2024
    SoCalNLP Symposium 2023 Best Paper Award
  2. NeurIPS
    Pre-trained Large Language Models Use Fourier Features to Compute Addition
    Tianyi Zhou, Deqing Fu, Vatsal Sharan, and Robin Jia
    In Conference on Neural Information Processing Systems (NeurIPS), 2024
  3. COLM
    IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations
    Deqing Fu*, Ruohao Guo*, Ghazal Khalighinejad*, Ollie Liu*, Bhuwan Dhingra, Dani Yogatama, Robin Jia, and Willie Neiswanger
    In Conference on Language Modeling (COLM), 2024
    *Equal Contribution